Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences | USENIX
Transparent GPU Sharing in Container Clouds for Deep Learning Workloads | USENIX
Using CUDA IPC memory handles in pytorch - PyTorch Forums
This thread contains an IPC approach (see the sketch below).
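A minimal sketch of the kind of IPC scheme the note refers to, assuming the standard PyTorch path: passing a CUDA tensor through torch.multiprocessing, which ships only a CUDA IPC memory handle (cudaIpcGetMemHandle / cudaIpcOpenMemHandle) rather than copying the data. This is not the forum thread's exact code, just an illustration of the mechanism.

```python
# Sketch: zero-copy sharing of a CUDA tensor between processes via CUDA IPC.
# torch.multiprocessing's reducer serializes only the IPC handle; the child
# process maps the same device allocation.
import torch
import torch.multiprocessing as mp


def consumer(queue):
    shared = queue.get()   # receives an IPC handle, maps the producer's memory
    shared += 1            # in-place update is visible to the producer


if __name__ == "__main__":
    mp.set_start_method("spawn")        # required when sending CUDA tensors
    queue = mp.Queue()
    t = torch.zeros(4, device="cuda")

    p = mp.Process(target=consumer, args=(queue,))
    p.start()
    queue.put(t)                        # producer must keep t alive while the consumer uses it
    p.join()
    print(t)                            # reflects the consumer's in-place update
```

Caveat: the producer has to keep the tensor alive for as long as any consumer maps it, and cross-process synchronization is the caller's responsibility; here p.join() provides the ordering.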